feat(telemetry): add streaming JSON export writer#960
Conversation
199d2e7 to
3a29ac9
Compare
9d7d4ce to
3c02990
Compare
3a29ac9 to
eb776a7
Compare
3c02990 to
350d662
Compare
eb776a7 to
ba095a9
Compare
ca1bd07 to
c0e517d
Compare
ba095a9 to
8066586
Compare
3d77d32 to
c9c3b74
Compare
d550ffd to
aca8405
Compare
c9c3b74 to
9c5a19d
Compare
9c5a19d to
914025b
Compare
|
/coder-agents-review |
There was a problem hiding this comment.
Solid extraction. The writeAtomically helper genuinely deduplicates the temp-file + rename + cleanup pattern, the streaming writer holds O(1) memory via pipeline, and the wire-format reuse through serializeTelemetryEvent prevents format drift. Tests are real (memfs-backed, non-vacuous assertions, success + empty + error paths). CI green.
Four P3s, one nit.
The sharpest finding is a convergent one: three reviewers independently flagged that "propagates midstream errors" only checks error propagation, not destination integrity. The PR description claims the latter. The atomicity property IS tested at the writeAtomically layer, but the integration boundary (streaming pipeline error -> atomic cleanup) is the interesting edge. Two lines fix it.
The cleanup-logging regression from the cliCredentialManager extraction is worth knowing. The old code logged temp-file cleanup failures; the new utility silently swallows them. The design choice is documented and reasonable (generic utility shouldn't take a logger), but on Windows where antivirus locks are real, orphaned .tmp-* files now accumulate without a trace.
"Huh, streaming with
highWaterMark: 1is clean. Memory stays flat regardless of export size." -- Killua
🤖 This review was automatically generated with Coder Agents.
|
/coder-agents-review |
034f5bc to
5a9edee
Compare
There was a problem hiding this comment.
All six R1 findings addressed cleanly in 034f5bc. The midstream test now verifies destination integrity, the onCleanupError callback restores diagnostic visibility, doc comments are accurate, and the suffix is consistent. Good fix round.
Two P3s on the new onCleanupError parameter itself.
The callback was the right design choice for DEREM-2, but the implementation has two gaps: it's untested (so the fix has no regression guard), and a throwing callback silently swallows the original write error despite the doc promising otherwise. Both are small fixes.
"The callback is the fix. An untested fix is an unreliable fix." -- Bisky
🤖 This review was automatically generated with Coder Agents.
1671ea2 to
5cbd8ba
Compare
Move `tempFilePath` and `renameWithRetry` from `src/util.ts` into a focused `src/util/fs.ts`, alongside a new `writeAtomically(path, write)` helper that captures the temp-file + rename + best-effort cleanup pattern previously copy-pasted into the telemetry writer and `CliCredentialManager.atomicWriteFile`. Both call sites now use the shared helper; the SSH config / support bundle / CLI manager keep their bespoke flows and just update imports. Rewrite `writeJsonArrayExport` to stream chunks through `Readable.from(...)` into `createWriteStream` via `stream/promises` `pipeline`. Memory stays flat regardless of `maxTotalBytes`, which has no enforced ceiling and can comfortably exceed the default 100 MB cap. Tests for `tempFilePath` and `renameWithRetry` move into `test/unit/util/fs.test.ts` alongside a new `writeAtomically` suite covering success, partial-write rollback, and callback return values.
- Move src/telemetry/export/writers.ts to writers/json.ts and the
matching test to writers/json.test.ts. The writer keeps the same
shape; the doc comment is tightened and the test drops a
redundant atomic-failure assertion already covered by
writeAtomically's own tests.
- Extract parseTelemetryTimestampMs from range.ts's
isTimestampInRange so other export writers can reuse it.
- Add a trivial asyncIterable test helper to test/mocks/.
- Inline the duplicated `vi.mock("node:fs", ...)` /
`vi.mock("node:fs/promises", ...)` factory pair (9 lines per file)
as a 2-line dynamic import in the affected test files.
Upstream-compatible groundwork for the OTLP writer branch (#961);
isolates move + helper-extraction churn so that PR's diff stays
focused on OTLP-specific work.
- Add a required onCleanupError callback to writeAtomically and writeJsonArrayExport so callers cannot silently lose temp-file cleanup diagnostics (matters on Windows, where AV/Indexer locks can leave large export orphans). cliCredentialManager passes its logger to restore the warn-on-cleanup behavior that existed before the extraction. - Wrap the rm+callback chain in a try/catch so a throwing onCleanupError can't replace the original write error. - Document that writeAtomically requires the parent directory to exist. - Use suffix "temp" so the call matches the doc example and the cliManager convention. - Cover the cleanup-failure path with new tests: callback invoked with the rm error, and a throwing callback still surfaces the writer's error. - Tighten the json writer's midstream-error test to assert the destination survives and no temp file is left behind. - Fix the asyncIterable helper's docstring: it exercises async iteration, not stream backpressure.
After the 999->888 takeover, the 888 monitor would eventually fail network reads, call processLost, then searchForProcess would find 888 again and emit ssh.process.recovered (same-PID rediscovery). On slow CI runners that race won the negative assertion at sshProcess.test.ts:469. Return [] from find after the takeover so no rediscovery is possible.
5cbd8ba to
d8a9b52
Compare
|
/coder-agents-review |
There was a problem hiding this comment.
All 8 prior findings addressed and verified. Netero clean, panel clean (Mafuuu, Bisky, Chopper). No new findings.
The onCleanupError callback evolution across rounds landed well: optional in R2, hardened against throwing callbacks in R3, now required so callers make an explicit choice about diagnostic visibility. The try/catch wrapping preserves the "original error always rethrown" contract. Tests cover all branches including the callback-throws path.
The sshProcess test fix (replacing mockResolvedValue with mockResolvedValueOnce + empty fallback) eliminates a race where the monitor loop could rediscover the replaced PID and emit a spurious recovery event.
🤖 This review was automatically generated with Coder Agents.
Summary
writeJsonArrayExportthat streams telemetry events to disk as a JSON array. Chunks are piped throughstream/promisespipelineso memory stays flat regardless ofmaxTotalBytes(default 100 MB, no enforced ceiling).src/util/fs.tsmodule hostingtempFilePath,renameWithRetry, and a newwriteAtomically(path, write, onCleanupError?)helper that captures the temp-file + rename + best-effort cleanup pattern.CliCredentialManager.atomicWriteFile(16 lines -> 4) and backs the export writer. SSH config / support bundle / CLI manager keep their bespoke flows and just update imports.Refs #903.
Stack: 2 / 4. Base: #959. Next: #961.
Why streaming
Export ranges include "All time", which reads everything on disk. With
maxFileBytesandmaxTotalBytesuser-tunable and unbounded, a buffered writer would peak at ~2x the export size in memory and OOM at high caps. Streaming holds memory constant at ~12 lines of code vs ~10 for the buffered version.Changes
src/telemetry/export/writers/json.ts(new):writeJsonArrayExportreturns the event count; usesserializeTelemetryEventfrom the shared wire format module so the output cannot drift from the on-disk shape.src/util/fs.ts(new):tempFilePath,renameWithRetry,writeAtomically(with optionalonCleanupErrorfor diagnosing orphaned temp files).src/util.ts: drops the moved helpers.src/core/cliCredentialManager.ts:atomicWriteFilenow defers towriteAtomicallyand passes its logger so cleanup failures are still logged.src/telemetry/export/range.ts: exportsparseTelemetryTimestampMsso the writer suite can build wire-format-equivalent events from the shared factory.test/unit/util/fs.test.ts(new): suite for the three helpers, including atomic-write success, partial-write rollback, and callback return values.test/unit/telemetry/export/writers/json.test.ts(new): usesmemfs, the sharedcreateTelemetryEventFactory, and the sharedserializeTelemetryEvent. Covers success, empty input, and a mid-stream failure leaving the destination untouched with no orphan temp file.test/unit/telemetry/export/files.test.ts: switched to a uniform memfs mock setup.test/mocks/asyncIterable.ts(new): shared helper that wraps a sync array as anAsyncIterableyielding one item per microtask.test/unit/util.test.ts: moved-out blocks removed.test/unit/core/supportBundleLogs.test.ts:vi.mocktarget retargeted to@/util/fs.